1,790 research outputs found

    Comparative modeling of mainly-beta proteins by profile wrapping

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 61-67). The ability to predict structure from sequence is particularly important for toxins, virulence factors, allergens, cytokines, and other proteins of public health importance. Many such functions are represented in the parallel β-helix fold class. Structure prediction for this fold is a challenging computational problem because there exists very little sequence similarity (less than 15%) across the SCOP family. This thesis introduces BetaWrapPro, a program for comparative modeling of the parallel β-helix fold. By estimating pairwise β-strand interaction probabilities, a profile of the target sequence is aligned, or "wrapped," onto an abstract supersecondary structural template. This wrapping procedure may capture folding processes that have an initiation stage followed by processive interaction between the unfolded region and the already-formed substructure. This wrap is then placed on a known structure and side-chains are modeled to produce a three-dimensional structure prediction. We demonstrate that wrapping onto an abstract template produces accurate structure predictions for this fold (in cross-validation: average Cα RMSD of 1.55 Å in accurately wrapped regions, with 88% of the residues accurately aligned). In addition, BetaWrapPro outperforms other fold recognition methods, recognizing the β-helix fold with 100% sensitivity at 99.7% specificity in cross-validation on the PDB. One striking result has been the prediction of an unexpected parallel β-helix structure for a pollen allergen, and its recent confirmation through solution of its structure. By Nathan Patrick Palmer. S.M.

    Data mining techniques for large-scale gene expression analysis

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 238-256). Modern computational biology is awash in large-scale data mining problems. Several high-throughput technologies have been developed that enable us, with relative ease and little expense, to evaluate the coordinated expression levels of tens of thousands of genes, evaluate hundreds of thousands of single-nucleotide polymorphisms, and sequence individual genomes. The data produced by these assays has provided the research and commercial communities with the opportunity to derive improved clinical prognostic indicators, as well as develop an understanding, at the molecular level, of the systemic underpinnings of a variety of diseases. Aside from the statistical methods used to evaluate these assays, another, more subtle challenge is emerging. Despite the explosive growth in the amount of data being generated and submitted to the various publicly available data repositories, very little attention has been paid to managing the phenotypic characterization of their samples (i.e., managing class labels in a controlled fashion). If sense is to be made of the underlying assay data, the samples' descriptive metadata must first be standardized in a machine-readable format. In this thesis, we explore these issues, specifically within the context of curating and analyzing a large DNA microarray database. We address three main challenges. First, we acquire a large subset of a publicly available microarray repository and develop a principled method for extracting phenotype information from free-text sample labels, then use that information to generate an index of the samples' medically relevant annotation. The indexing method we develop, Concordia, incorporates pre-existing expert knowledge relating to the hierarchical relationships between medical terms, allowing queries of arbitrary specificity to be efficiently answered. Second, we describe a highly flexible approach to answering the question: "Given a previously unseen gene expression sample, how can we compute its similarity to all of the labeled samples in our database, and how can we utilize those similarity scores to predict the phenotype of the new sample?" Third, we describe a method for identifying phenotype-specific transcriptional profiles within the context of this database, and explore a method for measuring the relative strength of those signatures across the rest of the database, allowing us to identify molecular signatures that are shared across various tissues and diseases. These shared fingerprints may form a quantitative basis for optimal therapy selection and drug repositioning for a variety of diseases. By Nathan Patrick Palmer. Ph.D.
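    The similarity question posed in the abstract above can be illustrated with a small sketch. This is not the method developed in the thesis, which the abstract does not specify; it assumes Pearson correlation as the similarity score and a k-nearest-neighbour majority vote as the prediction rule. The names `pearson`, `predict_phenotype`, and `database`, and the toy expression values, are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_a * var_b)


def predict_phenotype(query, labeled_samples, k=3):
    """Majority label among the k labeled samples most correlated with query."""
    scored = sorted(
        ((pearson(query, profile), label) for profile, label in labeled_samples),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy database: (expression profile, phenotype label) pairs.
database = [
    ([1.0, 2.0, 3.0, 4.0], "tumor"),
    ([2.0, 4.0, 6.0, 8.5], "tumor"),
    ([4.0, 3.0, 2.0, 1.0], "normal"),
    ([8.0, 6.0, 4.0, 2.5], "normal"),
]
print(predict_phenotype([1.0, 2.0, 3.0, 5.0], database, k=3))
```

    A real database of this kind would hold tens of thousands of genes per sample, so a vectorized similarity computation and a weighting of votes by similarity score would likely be preferable to this plain loop.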

    A Practical Platform for Blood Biomarker Study by Using Global Gene Expression Profiling of Peripheral Whole Blood

    Background: Although microarray technology has become the most common method for studying global gene expression, a plethora of technical factors across the experiment contribute to the variability of genome-wide gene expression profiling using peripheral whole blood. A practical platform needs to be established in order to obtain reliable and reproducible data to meet clinical requirements for biomarker study. Methods and Findings: We applied globin reduction to peripheral whole blood samples and performed genome-wide transcriptome analysis using Illumina BeadChips. Real-time PCR was subsequently used to evaluate the quality of array data and elucidate the mode in which hemoglobin interferes in gene expression profiling. We demonstrated that, when applied in the context of standard microarray processing procedures, globin reduction results in a consistent and significant increase in the quality of beadarray data. When compared to their pre-globin reduction counterparts, post-globin reduction samples show improved detection statistics, lowered variance and increased sensitivity. More importantly, gender gene separation is remarkably clearer in post-globin reduction samples than in pre-globin reduction samples. Our study suggests that the poor data obtained from pre-globin reduction samples is the result of the high concentration of hemoglobin derived from red blood cells either interfering with target mRNA binding or producing a pseudo-binding background signal. Conclusion: We therefore recommend the combination of performing globin mRNA reduction in peripheral whole blood samples and hybridizing on Illumina BeadChips as the practical approach for biomarker study.

    Discovery of novel heart rate-associated loci using the Exome Chip

    Resting heart rate is a heritable trait, and an increase in heart rate is associated with increased mortality risk. Genome-wide association study analyses have found loci associated with resting heart rate; at the time of our study, these loci explained 0.9% of the variation. This study aims to discover new genetic loci associated with heart rate from Exome Chip meta-analyses. Heart rate was measured from either electrocardiograms or pulse recordings. We meta-analysed heart rate association results from 104 452 European-ancestry individuals from 30 cohorts, genotyped using the Exome Chip. Twenty-four variants were selected for follow-up in an independent dataset (UK Biobank, N = 134 251). Conditional and gene-based testing was undertaken, and variants were investigated with bioinformatics methods. We discovered five novel heart rate loci, and one new independent low-frequency non-synonymous variant in an established heart rate locus (KIAA1755). Lead variants in four of the novel loci are non-synonymous variants in the genes C10orf71, DALDR3, TESK2 and SEC31B. The variant at SEC31B is significantly associated with SEC31B expression in heart and tibial nerve tissue. Further candidate genes were detected from long-range regulatory chromatin interactions in heart tissue (SCD, SLF2 and MAPK8). We observed significant enrichment in DNase I hypersensitive sites in fetal heart and lung. Moreover, enrichment was seen for the first time in human neuronal progenitor cells (derived from embryonic stem cells) and fetal muscle samples by including our novel variants. Our findings advance the knowledge of the genetic architecture of heart rate, and indicate new candidate genes for follow-up functional studies.

    Genetic determinants of telomere length from 109,122 ancestrally diverse whole-genome sequences in TOPMed

    Genetic studies on telomere length (TL) are important for understanding age-related diseases. Prior GWAS for leukocyte TL have been limited to European and Asian populations. Here, we report the first sequencing-based association study for TL across ancestrally diverse individuals (European, African, Asian and Hispanic/Latino) from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. We used whole genome sequencing (WGS) of whole blood for variant genotype calling and the bioinformatic estimation of telomere length in n=109,122 individuals. We identified 59 sentinel variants (p-value [...] OBFC1 indicated the independent signals colocalized with cell-type specific eQTLs for OBFC1 (STN1). Using a multi-variant gene-based approach, we identified two genes newly implicated in telomere length, DCLRE1B (SNM1B) and PARN. In PheWAS, we demonstrated our TL polygenic trait scores (PTS) were associated with increased risk of cancer-related phenotypes.

    Genetic diversity fuels gene discovery for tobacco and alcohol use

    Tobacco and alcohol use are heritable behaviours associated with 15% and 5.3% of worldwide deaths, respectively, due largely to broad increased risk for disease and injury (1-4). These substances are used across the globe, yet genome-wide association studies have focused largely on individuals of European ancestries (5). Here we leveraged global genetic diversity across 3.4 million individuals from four major clines of global ancestry (approximately 21% non-European) to power the discovery and fine-mapping of genomic loci associated with tobacco and alcohol use, to inform function of these loci via ancestry-aware transcriptome-wide association studies, and to evaluate the genetic architecture and predictive power of polygenic risk within and across populations. We found that increases in sample size and genetic diversity improved locus identification and fine-mapping resolution, and that a large majority of the 3,823 associated variants (from 2,143 loci) showed consistent effect sizes across ancestry dimensions. However, polygenic risk scores developed in one ancestry performed poorly in others, highlighting the continued need to increase sample sizes of diverse ancestries to realize any potential benefit of polygenic prediction. Peer reviewed.

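    Optimasi Portofolio Resiko Menggunakan Model Markowitz MVO Dikaitkan dengan Keterbatasan Manusia dalam Memprediksi Masa Depan dalam Perspektif Al-Qur`an (Risk Portfolio Optimization Using the Markowitz MVO Model in Relation to Human Limitations in Predicting the Future from the Perspective of the Al-Qur`an)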

    Risk portfolio management in modern finance has become increasingly technical, requiring the use of sophisticated mathematical tools in both research and practice. Since companies cannot insure themselves completely against risk, given the human inability to predict the future precisely, as written in Al-Quran surah Luqman verse 34, they have to manage it to yield an optimal portfolio. The objective here is to minimize the variance among all portfolios, or alternatively, to maximize expected return among all portfolios that have at least a certain expected return. Furthermore, this study focuses on optimizing the risk portfolio with the so-called Markowitz MVO (Mean-Variance Optimization). The theoretical frameworks used for the analysis are the arithmetic mean, geometric mean, variance, covariance, linear programming, and quadratic programming. Moreover, finding a minimum-variance portfolio produces a convex quadratic program, that is, minimizing a quadratic objective function in x subject to a linear inequality constraint on x and the equality constraint Ax = b. The outcome of this research is the solution of the optimal risk portfolio for some investments, which could be computed smoothly using MATLAB R2007b software together with its graphical analysis.
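    As a minimal sketch of the minimum-variance problem mentioned above, the following Python code handles only a simplified case: a single budget constraint (weights sum to one), no expected-return inequality, and short selling allowed. Under those assumptions the optimal weights are proportional to the inverse covariance matrix applied to a vector of ones. The study itself used MATLAB R2007b; the function names and the toy covariance matrix here are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Build an augmented copy so the inputs are not modified.
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def min_variance_weights(cov):
    """Weights minimizing w' cov w subject to sum(w) == 1 (shorting allowed)."""
    ones = [1.0] * len(cov)
    unscaled = solve_linear(cov, ones)  # proportional to inverse(cov) @ ones
    total = sum(unscaled)
    return [u / total for u in unscaled]


def portfolio_variance(cov, weights):
    """Quadratic form w' cov w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Hypothetical covariance matrix for two uncorrelated assets.
cov = [[0.04, 0.0], [0.0, 0.01]]
weights = min_variance_weights(cov)
print(weights, portfolio_variance(cov, weights))
```

    Adding the minimum-return inequality or a no-short-selling bound turns this into a general quadratic program, which needs a dedicated QP solver.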

    Differential cross section measurements for the production of a W boson in association with jets in proton–proton collisions at √s = 7 TeV

    Measurements are reported of differential cross sections for the production of a W boson, which decays into a muon and a neutrino, in association with jets, as a function of several variables, including the transverse momenta (pT) and pseudorapidities of the four leading jets, the scalar sum of jet transverse momenta (HT), and the difference in azimuthal angle between the directions of each jet and the muon. The data sample of pp collisions at a centre-of-mass energy of 7 TeV was collected with the CMS detector at the LHC and corresponds to an integrated luminosity of 5.0 fb⁻¹. The measured cross sections are compared to predictions from Monte Carlo generators, MadGraph + pythia and sherpa, and to next-to-leading-order calculations from BlackHat + sherpa. The differential cross sections are found to be in agreement with the predictions, apart from the pT distributions of the leading jets at high pT values, the distributions of the HT at high-HT and low jet multiplicity, and the distribution of the difference in azimuthal angle between the leading jet and the muon at low values. Funding: United States. Dept. of Energy; National Science Foundation (U.S.); Alfred P. Sloan Foundation.

    Impacts of the Tropical Pacific/Indian Oceans on the Seasonal Cycle of the West African Monsoon

    The current consensus is that drought has developed in the Sahel during the second half of the twentieth century as a result of remote effects of oceanic anomalies amplified by local land–atmosphere interactions. This paper focuses on the impacts of oceanic anomalies upon West African climate and specifically aims to identify those from SST anomalies in the Pacific/Indian Oceans during spring and summer seasons, when they were significant. Idealized sensitivity experiments are performed with four atmospheric general circulation models (AGCMs). The prescribed SST patterns used in the AGCMs are based on the leading mode of covariability between SST anomalies over the Pacific/Indian Oceans and summer rainfall over West Africa. The results show that such oceanic anomalies in the Pacific/Indian Oceans lead to a northward shift of an anomalous dry belt from the Gulf of Guinea to the Sahel as the season advances. In the Sahel, the magnitude of rainfall anomalies is comparable to that obtained by other authors using SST anomalies confined to the proximity of the Atlantic Ocean. The mechanism connecting the Pacific/Indian SST anomalies with West African rainfall has a strong seasonal cycle. In spring (May and June), anomalous subsidence develops over both the Maritime Continent and the equatorial Atlantic in response to the enhanced equatorial heating. Precipitation increases over continental West Africa in association with stronger zonal convergence of moisture. In addition, precipitation decreases over the Gulf of Guinea. During the monsoon peak (July and August), the SST anomalies move westward over the equatorial Pacific and the two regions where subsidence occurred earlier in the season merge over West Africa. The monsoon weakens and rainfall decreases over the Sahel, especially in August. Peer reviewed.